High-Dimensional Data Cubes
Abstract
This paper introduces an approach to supporting high-dimensional data cubes at interactive query speeds and moderate storage cost. The approach is based on binary(-domain) data cubes that are judiciously partially materialized; the missing information can be quickly reconstructed using statistical or linear programming techniques. This enables new applications such as exploratory data analysis for feature engineering and other fields of data science. Moreover, it removes the need to compromise when building a data cube: all columns we might ever wish to use can be included as dimensions. Our approach also speeds up certain dice, roll-up, and drill-down operations on data cubes with hierarchical dimensions compared to traditional data cubes.
Related resources
Approximate Query Answering in High-Dimensional Data Cubes
Data mining work has been successful in both compressing and modeling large data sets with many values. However, the problem of high-dimensions has not been sufficiently addressed. In our work, we develop a new data reduction method that aims to speed subsequent data analysis by efficiently constructing a high-dimensional, joint probability distribution. This distribution summarizes the data by...
Maximal inequality for high-dimensional cubes
We present lower estimates for the best constant appearing in the weak (1, 1) maximal inequality in the space (R^n, ‖ · ‖∞). We show that this constant grows to infinity faster than (log n)^(1−o(1)) when n tends to infinity. To this end, we follow and simplify the approach used by J.M. Aldaz. The new part of the argument relies on Donsker’s theorem identifying the Brownian bridge as the limit object ...
Methods for regression analysis in high-dimensional data
With advances in science, knowledge, and technology, new and precise methods for measuring, collecting, and recording information have been developed, which has led to the emergence and growth of high-dimensional data. A high-dimensional data set, i.e., a data set in which the number of explanatory variables is much larger than the number of observations, cannot be easily analyzed by ...
High Performance Data Mining Using Data Cubes on Parallel Computers
On-Line Analytical Processing techniques are used for data analysis and decision support systems. The multidimensionality of the underlying data is well represented by multidimensional databases. For data mining in knowledge discovery, OLAP calculations can be effectively used. For these, high performance parallel systems are required to provide interactive analysis. Precomputed aggregate calcu...
Mining Multi-Dimensional Constrained Gradients in Data Cubes
Constrained gradient analysis (similar to the “cubegrade” problem posed by Imielinski, et al. [9]) is to extract pairs of similar cell characteristics associated with big changes in measure in a data cube. Cells are considered similar if they are related by roll-up, drill-down, or 1-dimensional mutation operation. Constrained gradient queries are expressive, capable of capturing trends in data ...
Journal
Journal title: Proceedings of the VLDB Endowment
Year: 2022
ISSN: 2150-8097
DOI: https://doi.org/10.14778/3565838.3565839